Constant-Time Markov Tracking for Sequential POMDPs

Author

  • Richard Washington
Abstract

Introduction. Stochastic representations of a problem domain can capture aspects that are otherwise difficult to model, such as errors, alternative outcomes of actions, and uncertainty about the world. Markov models (Bellman 1957) are a popular choice for constructing stochastic representations. The classic MDP, however, does not account for uncertainty in the process state; often this state is known only indirectly through sensors or tests. To account for the state uncertainty, MDPs have been extended to partially observable MDPs (POMDPs) (Lovejoy 1991; Monahan 1982; Washington 1996). In this model, the underlying process is an MDP, but rather than supplying exact state information, the process produces one of a set of observations. The major drawback of POMDPs is that finding an optimal plan is computationally intractable for realistic problems. We are interested in seeing what limitations may be imposed that allow on-line computation. We have developed the approach of Markov Tracking (Washington 1997), which chooses actions that coordinate with a process rather than influence its behavior. It uses the POMDP model to follow the agent's state and reacts optimally to it. In this paper we discuss the application of Markov Tracking to a subclass of problems called sequential POMDPs. Within this class the optimal action is computed not only locally but in constant time, allowing true on-line performance on large-scale problems.
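The core operation behind tracking a POMDP's state is the Bayesian belief update: predict forward through the transition model, correct with the observation likelihood, and renormalize; an action can then be chosen by reacting to the current belief. The following is a minimal sketch of that idea, not the paper's algorithm; the three-state model, the matrices `T`, `O`, and `R`, and the reward-maximizing action rule are all hypothetical illustrations.

```python
import numpy as np

# Hypothetical 3-state, 2-observation model (illustrative only).
T = np.array([[0.8, 0.2, 0.0],   # T[s, s'] = P(s' | s)
              [0.0, 0.7, 0.3],
              [0.0, 0.0, 1.0]])
O = np.array([[0.9, 0.1],        # O[s', o] = P(o | s')
              [0.4, 0.6],
              [0.1, 0.9]])

def update_belief(b, obs):
    """One Bayesian filtering step: predict, weight by likelihood, normalize."""
    b_pred = b @ T               # prediction through the transition model
    b_new = b_pred * O[:, obs]   # correction by the observation likelihood
    return b_new / b_new.sum()

b = np.array([1.0, 0.0, 0.0])    # initially certain of state 0
b = update_belief(b, obs=1)      # -> belief shifts toward states 1 and 2

# Reacting to the tracked belief: pick the action with the best
# expected immediate reward under b (hypothetical reward matrix R).
R = np.array([[1.0, 0.0],        # R[s, a]
              [0.0, 1.0],
              [0.0, 1.0]])
a = int(np.argmax(b @ R))
```

Each update costs a fixed number of matrix-vector operations in the number of states, which is the sense in which reacting to the belief, rather than planning over future beliefs, stays cheap.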


Similar resources

A Particle Filtering Algorithm for Interactive POMDPs

Interactive POMDP (I-POMDP) is a stochastic optimization framework for sequential planning in multiagent settings. It represents a direct generalization of POMDPs to multiagent cases. Expectedly, I-POMDPs also suffer from a high computational complexity, thereby motivating approximation schemes. In this paper, we propose using a particle filtering algorithm for approximating the I-POMDP belief ...
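A particle filter approximates a belief by a population of sampled states: propagate each particle through the transition model, weight it by the observation likelihood, and resample. The sketch below is a generic illustration of that scheme for a discrete model, not the I-POMDP algorithm of the paper; the two-state model `T`/`O` and the probabilities are hypothetical.

```python
import random

# Hypothetical 2-state model (illustrative only).
T = {0: [(0, 0.8), (1, 0.2)], 1: [(1, 1.0)]}    # P(s' | s)
O = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # P(o | s')

def particle_step(particles, obs, rng):
    """One filtering step: propagate, weight by likelihood, resample."""
    moved = []
    for s in particles:
        r, acc = rng.random(), 0.0
        for s2, p in T[s]:       # sample s' from P(s' | s)
            acc += p
            if r <= acc:
                moved.append(s2)
                break
    weights = [O[s][obs] for s in moved]
    return rng.choices(moved, weights=weights, k=len(moved))

rng = random.Random(0)
particles = [0] * 1000                       # start certain of state 0
particles = particle_step(particles, obs=1, rng=rng)
belief1 = particles.count(1) / len(particles)  # estimated P(state 1)
```

The attraction is that the cost per step scales with the number of particles rather than the size of the state space, at the price of a sampling approximation to the belief.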

Full text

Stochastic Local Search for POMDP Controllers

The search for finite-state controllers for partially observable Markov decision processes (POMDPs) is often based on approaches like gradient ascent, attractive because of their relatively low computational cost. In this paper, we illustrate a basic problem with gradient-based methods applied to POMDPs, where the sequential nature of the decision problem is at issue, and propose a new stochast...

Full text

Hilbert Space Embeddings of PSRs

Many problems in machine learning and artificial intelligence involve discrete-time partially observable nonlinear dynamical systems. If the observations are discrete, then Hidden Markov Models (HMMs) (Rabiner, 1989) or, in the control setting, Partially Observable Markov Decision Processes (POMDPs) (Sondik, 1971) can be used to represent belief as a discrete distribution over latent states. Pr...

Full text

A Framework for Optimal Sequential Planning in Multiagent Settings

Introduction Research in autonomous agent planning is gradually moving from single-agent environments to those populated by multiple agents. In single-agent sequential environments, partially observable Markov decision processes (POMDPs) provide a principled approach for planning under uncertainty. They improve on classical planning by not only modeling the inherent non-determinism of the probl...

Full text

Believing in POMDPs

Partially observable Markov decision processes (POMDP) are well-suited for realizing sequential decision making capabilities that respect uncertainty in Companion systems that are to naturally interact with and assist human users. Unfortunately, their complexity prohibits modeling the entire Companion system as a POMDP. We therefore propose an approach that makes use of abstraction to enable em...

Full text




Publication date: 2002